[SPARK-23626][CORE] DAGScheduler blocked due to JobSubmitted event #27234
Closed
ajithme wants to merge 1 commit into apache:master from
Can one of the admins verify this patch?
This PR revives #24438, which was closed due to inactivity. In the old PR, @squito asked for the partition state of the RDD to be guarded by a lock (see comment: #24438 (review)). That has since been accomplished by #25951 (SPARK-28917). Please review @squito @dongjoon-hyun @vanzin @srowen
Gentle ping @squito @dongjoon-hyun @vanzin @srowen
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
What changes were proposed in this pull request?
Forcing partition evaluation in the callsite thread before sending the org.apache.spark.scheduler.JobSubmitted event to org.apache.spark.scheduler.DAGScheduler#eventProcessLoop helps mitigate job submission events blocking the DAGScheduler thread.
Why are the changes needed?
DAGScheduler becomes a bottleneck in a cluster when multiple JobSubmitted events have to be processed, because DAGSchedulerEventProcessLoop is single threaded and a slow event blocks the other events in the queue, such as TaskCompletion.
Handling a JobSubmitted event can be time consuming depending on the nature of the job (for example: calculating parent stage dependencies, shuffle dependencies, partitions), and while it runs no other event is processed.
Similarly, in my cluster the partition calculation of some jobs is time consuming (similar to the stack at SPARK-2647), which slows down the Spark DAGSchedulerEventProcessLoop. As a result, user jobs slow down even if their tasks finish within seconds, because TaskCompletion events are processed at a slower rate due to the blockage.
Refer: http://apache-spark-developers-list.1001551.n3.nabble.com/Spark-Scheduler-Spark-DAGScheduler-scheduling-performance-hindered-on-JobSubmitted-Event-td23562.html
I see multiple JIRAs referring to this behavior:
https://issues.apache.org/jira/browse/SPARK-2647
https://issues.apache.org/jira/browse/SPARK-4961
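The blocking described above can be illustrated with a small toy model. This is not Spark code and not the patch itself: it is a minimal Python sketch, assuming a single consumer thread that drains an event queue and a lazily computed, lock-guarded partition list. The names LazyPartitions and run are hypothetical and exist only for this illustration.

```python
import queue
import threading
import time


class LazyPartitions:
    """Hypothetical stand-in for an RDD whose partitions are computed lazily and cached."""

    def __init__(self, cost_seconds):
        self._cost_seconds = cost_seconds
        self._cached = None
        self._lock = threading.Lock()  # guards the cached partition state

    def partitions(self):
        with self._lock:
            if self._cached is None:
                time.sleep(self._cost_seconds)  # simulate an expensive partition listing
                self._cached = [0, 1, 2, 3]
            return self._cached


def run(eager, job_cost=0.3):
    """Return how long a task-completion event waited behind a job submission."""
    events = queue.Queue()
    waited = {}

    def loop():
        # Single consumer thread: events are handled strictly one at a time.
        while True:
            kind, payload, posted_at = events.get()
            if kind == "stop":
                return
            if kind == "job_submitted":
                payload.partitions()  # cheap if already cached, slow otherwise
            elif kind == "task_completion":
                waited["task_completion"] = time.monotonic() - posted_at

    worker = threading.Thread(target=loop)
    worker.start()

    rdd = LazyPartitions(job_cost)
    if eager:
        # Pay the partition cost in the submitting thread, before posting the event.
        rdd.partitions()
    events.put(("job_submitted", rdd, time.monotonic()))
    events.put(("task_completion", None, time.monotonic()))
    events.put(("stop", None, time.monotonic()))
    worker.join()
    return waited["task_completion"]


if __name__ == "__main__":
    print(f"lazy : task completion waited {run(eager=False):.2f}s")
    print(f"eager: task completion waited {run(eager=True):.2f}s")
```

In the lazy case the task-completion event waits for roughly the full partition cost, because the loop thread is busy computing partitions. In the eager case the same cost is paid by the submitting thread, so the loop thread only sees a cached result and the later event is handled almost immediately.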
Does this PR introduce any user-facing change?
No
How was this patch tested?
Added a unit test to reproduce the issue and evaluate the fix.